KMID : 1022420150070040003
Phonetics and Speech Sciences
2015, Vol. 7, No. 4, pp. 3-9
Input Dimension Reduction based on Continuous Word Vector for Deep Neural Network Language Model
Kim Kwang-Ho, Lee Dong-Hyun, Lim Min-Kyu, Kim Ji-Hwan
Abstract
In this paper, we investigate an input dimension reduction method that uses continuous word vectors in a deep neural network language model. In the proposed method, continuous word vectors are generated with Google's Word2Vec from a large training corpus so that they satisfy the distributional hypothesis. The 1-of-|V| coded discrete word vectors are then replaced with their corresponding continuous word vectors. In our implementation, the input dimension was reduced from 20,000 to 600 when a tri-gram language model was used with a vocabulary of 20,000 words. The total training time on the Wall Street Journal training corpus (37M words) was reduced from 30 days to 14 days.
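As a rough illustration of the input construction described in the abstract, the following Python sketch contrasts a concatenated 1-of-|V| input with a concatenated continuous-vector input for a tri-gram model. The toy corpus, the 300-dimensional vector size, and all variable names are illustrative assumptions, not the authors' implementation; gensim's Word2Vec is used here as a stand-in for Google's original Word2Vec tool.

import numpy as np
from gensim.models import Word2Vec

# Toy stand-in corpus; the paper trains on the Wall Street Journal corpus.
corpus = [
    ["the", "stock", "market", "rose", "sharply"],
    ["the", "stock", "market", "fell", "sharply"],
]

# Train 300-dimensional continuous word vectors; the two history words of
# a tri-gram then yield a 600-dimensional network input.
w2v = Word2Vec(sentences=corpus, vector_size=300, window=5, min_count=1)

vocab = list(w2v.wv.key_to_index)            # vocabulary of size |V|
word_to_id = {w: i for i, w in enumerate(vocab)}
V = len(vocab)

def one_hot_input(history):
    # Discrete baseline: concatenated 1-of-|V| codes; dimension grows with |V|.
    x = np.zeros(len(history) * V)
    for k, w in enumerate(history):
        x[k * V + word_to_id[w]] = 1.0
    return x

def continuous_input(history):
    # Proposed input: concatenated continuous word vectors; dimension is
    # fixed at len(history) * 300 regardless of vocabulary size.
    return np.concatenate([w2v.wv[w] for w in history])

history = ["stock", "market"]                # the two preceding words
print(one_hot_input(history).shape)          # (2*|V|,) -- grows with vocabulary
print(continuous_input(history).shape)       # (600,)   -- fixed size

Either vector would be fed to the first hidden layer of the DNN language model; the smaller input dimension shrinks that layer's weight matrix, which is presumably the main source of the reported training-time reduction.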
KEYWORDS
deep neural network, language model, continuous word vector, input dimension reduction
Listed journal information
Korea Citation Index (KCI)